Abstract: In this paper we propose a new method for solving
the Automatic Aircraft Recognition (AAR) problem from a
sequence of images of an unknown observed aircraft. Our method
exploits the knowledge extracted from a training image dataset
(a set of binary images of different aircraft observed under three
different poses) together with the fusion of information from multiple
features drawn from the image sequence using Dezert-Smarandache
Theory (DSmT) coupled with Hidden Markov Models (HMM).
In the first step of the method, we compute for each image of the
observed aircraft both Hu’s moment invariants (the
first features vector) and the partial singular values of the outline
of the aircraft (the second features vector). In the second step,
we use a probabilistic neural network (PNN) based on the
training image dataset to construct the conditional basic belief
assignments (BBA’s) of the unknown aircraft type within the set
of predefined possible target types given the features vectors
and pose condition. The BBA’s are then combined by
the Proportional Conflict Redistribution rule #5 (PCR5) of DSmT
to get a global BBA about the target type under a given pose
hypothesis. These sequential BBA’s give initial recognition results
that feed an HMM-based classifier for automatically recognizing
the aircraft in a multiple-pose context. The last part of this
paper shows the effectiveness of this new Sequential Multiple-Features
Automatic Target Recognition (SMF-ATR) method with
realistic simulation results. This method is compatible with the
real-time processing requirements of advanced AAR systems.
Keywords: Information fusion; DSmT; ATR; HMM. 

I. Introduction 

ATR (Automatic Target Recognition) systems play a major
role on the modern battlefield for automatic monitoring,
detection and identification, as well as for precision-guided weapons.
The Automatic Aircraft Recognition (AAR) problem is
a subclass of the ATR problem. Many scholars have made
extensive explorations for solving ATR and AAR problems.
ATR methods are usually based on target recognition using
template matching [1], [2] and single feature (SF) extraction
[3]-[7] algorithms. Unfortunately, erroneous recognition often
occurs when utilizing target recognition algorithms based on
a single feature only, especially if there are large changes in the
pose and appearance of the aircraft along its flight path in the image
sequence. In such conditions, the informational content drawn
from single feature measures is not sufficient to make a
reliable classification. To overcome this serious drawback, new
ATR algorithms based on multiple features (MF) and fusion
techniques have been proposed [8]-[12]. An interesting MF-ATR
algorithm based on a Back-Propagation Neural Network
(BP-NN) and the Dempster-Shafer Theory (DST) of evidence
[23] has been proposed by Yang et al. in [11]; it has been
partly the source of inspiration for developing the improved
sequential MF-ATR method presented here and introduced
briefly in [12] (in Chinese). In this paper we explain in
detail how our new SMF-ATR method works and we evaluate
its performance on a typical real image sequence.

Although the MF-ATR approach reduces the deficiency of the SF-ATR
approach in general, the recognition results can sometimes
still be indeterminate from the exploitation of a single image
because the pose and appearance of different kinds of aircraft
can be very similar for some instantaneous poses and
appearances. To eliminate (or reduce) uncertainty and improve
the classification, it is necessary to exploit a sequence of
images of the observed aircraft during its flight and to develop
efficient techniques of sequential information fusion for
advanced (sequential) MF-ATR systems. Two pioneering works
on sequential ATR algorithms using belief functions (BF)
have been proposed in recent years. In 2006, Huang et al. in
[13] developed a sequential ATR based on BF, Hu’s
moment invariants (for the image features vector), a BP-NN for
pattern classification, and a modified Dempster-Shafer (DS)
fusion rule 1 . An SF-ATR approach using BF, Hu’s moment
invariants, BP-NN and a DSmT rule was also proposed
in [14] the same year. In these papers, the authors clearly
showed the benefit of integrating temporal SF measures
for target recognition, but the performances obtained were
still limited because of the large possible changes in the poses and
appearances of the observed aircraft (especially in high-maneuver
modes as far as military aircraft are concerned). The
purpose of this paper is to develop a new (sequential) MF-ATR
method able to provide a high recognition rate with good
robustness when faced with large changes in the pose and appearance
of the observed aircraft during its flight.

Fig. 1: General principle of our sequential MF-ATR approach.

1 Called the abortion method by the authors.

2 The introduction of extra features is possible and under investigation.

The general principle of our SMF-ATR method is shown in
Fig. 1. The upper part of Fig. 1 covers Steps 1 & 2, whereas
the lower part of Fig. 1 covers Steps 3 & 4, which are
described as follows:

• Step 1 (Features extraction) : We consider and extract
only two features vectors in this work 2 (Hu’s moment
invariants vector, and Singular Value Decomposition
(SVD) features vector) from the binary images 3 .

• Step 2 (BBA’s construction 4 ) : For every image in the sequence
and from its two features vectors, two Bayesian
BBA’s on the possible (target type, target pose) pairs are computed
from the outputs of two PNN’s trained on the image
dataset. The method of BBA construction is different
from the one proposed in [12].

• Step 3 (BBA’s combination) : For every image, say the
$k$-th image, in the sequence, the two BBA’s of Step 2
are combined with the PCR5 fusion rule, from which a
decision $O_k$ on the most likely target type and pose is
drawn.

• Step 4 (HMM-based classifier) : The sequence
$O^K = \{O_1, \ldots, O_k, \ldots, O_K\}$ of $K$ local decisions computed
at Step 3 feeds several HMM-based classifiers
in parallel (one HMM characterizes each target type),
and we finally find the most likely target observed in the
image sequence, which gives the output of our SMF-ATR
approach.

The next section presents each step of this new SMF-ATR
approach. Section III evaluates the performance of this new
method on real image datasets. Conclusions and perspectives
of this work are given in Section IV.

II. The sequential MF-ATR approach 

In this section we present the aforementioned steps necessary
for the implementation of our new SMF-ATR method.

3 In this work, we work only with binary images because our training image
dataset contains only binary images with clean backgrounds, and working
with binary images is easier and requires less computation
than working with grey-level or color images. Hence it helps to satisfy the
real-time processing requirement. The binarization of the images of the sequence under analysis
is done with the Flood Fill Method explained in detail in [22], using a
point of the background as the seed for the method.

4 The mathematical definition of a BBA is given in Section II-C. 



A. Step 1: Features extraction from binary image 

Because aircraft poses in a flight can vary greatly, we need
image features that are stable and remain unchanged under
translation, rotation and scaling. In terms of aircraft features,
two categories are widely used: 1) moment features and 2)
contour features. Image moments have long been widely used,
especially in pattern-recognition applications [16].
Moment features, which describe the regional
characteristics of the image, are mainly obtained from the intensity of each
pixel of the target image. Contour features are extracted primarily
by discretizing the outline contour, and they describe the
characteristics of the outline of the object in the image. As
moment features, Hu’s moment invariants [6] are used here.
As contour features, we use the SVD [15] of the outlines extracted
from the binary images.



• Hu’s moments 

Two-dimensional $(p+q)$-th order moments for $p, q = 0, 1, 2, \ldots$ of an image of size $M \times N$ are defined as follows:

$$m_{pq} = \sum_{m=1}^{M} \sum_{n=1}^{N} m^p n^q f(m,n) \qquad (1)$$

where $f(m,n)$ is the value of the pixel $(m,n)$ of the binary
image. Note that $m_{pq}$ may not be invariant when $f(m,n)$ is transformed by
translation, rotation or scaling. The invariant features can be
obtained using the $(p+q)$-th order central moments $\mu_{pq}$ for
$p, q = 0, 1, 2, \ldots$ defined by

$$\mu_{pq} = \sum_{m=1}^{M} \sum_{n=1}^{N} (m - \bar{x})^p (n - \bar{y})^q f(m,n) \qquad (2)$$



where $\bar{x}$ and $\bar{y}$ are the barycentric coordinates of the image (i.e.
the centroid of the image). These values are computed by
$\bar{x} = \frac{m_{10}}{m_{00}} = \frac{1}{C} \sum_{m=1}^{M} \sum_{n=1}^{N} m \times f(m,n)$ and
$\bar{y} = \frac{m_{01}}{m_{00}} = \frac{1}{C} \sum_{m=1}^{M} \sum_{n=1}^{N} n \times f(m,n)$, where $C$ is a normalization
constant given by $C = m_{00} = \sum_{m=1}^{M} \sum_{n=1}^{N} f(m,n)$. The
central moment $\mu_{pq}$ is equivalent to the $m_{pq}$ moment whose
center has been shifted to the centroid of the image. Therefore, the
$\mu_{pq}$ are invariant to image translations. Scale invariance is obtained
by normalization [6]. The normalized central moments
$\eta_{pq}$ are defined for $p + q = 2, 3, \ldots$ by $\eta_{pq} = \mu_{pq}/\mu_{00}^{\gamma}$, with
$\gamma = (p+q+2)/2$. Based on these normalized central moments,
Hu in [16] derived seven moment invariants that are unchanged
under image scaling, translation and rotation as follows

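$$\begin{aligned}
\Phi_1 &= \eta_{20} + \eta_{02} \\
\Phi_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \\
\Phi_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \\
\Phi_4 &= (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \\
\Phi_5 &= (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] \\
&\quad + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] \\
\Phi_6 &= (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] \\
&\quad + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \\
\Phi_7 &= (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] \\
&\quad - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{03} + \eta_{21})^2]
\end{aligned}$$

In this work, we use only the four Hu’s moments that are simplest to
compute, that is $\Phi = [\Phi_1\ \Phi_2\ \Phi_3\ \Phi_4]$, to feed the first PNN
of our sequential MF-ATR method 5 .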
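As an illustration of Eqs. (1)-(2) and of the first four invariants, the following is a minimal pure-Python sketch, not the code used for the simulations. The function names `raw_moment` and `hu_first_four` and the list-of-rows image layout are assumptions made for this example; pixel indices start at 1 as in Eq. (1).

```python
def raw_moment(img, p, q):
    """Raw moment m_pq of Eq. (1) for a binary image given as a list of rows."""
    return sum((m ** p) * (n ** q) * v
               for m, row in enumerate(img, start=1)
               for n, v in enumerate(row, start=1))


def hu_first_four(img):
    """Return [Phi_1, Phi_2, Phi_3, Phi_4] for a non-empty binary image."""
    m00 = raw_moment(img, 0, 0)
    if m00 == 0:
        raise ValueError("image contains no foreground pixel")
    # Centroid of the image (x_bar, y_bar).
    xb = raw_moment(img, 1, 0) / m00
    yb = raw_moment(img, 0, 1) / m00

    def mu(p, q):
        # Central moment of Eq. (2).
        return sum(((m - xb) ** p) * ((n - yb) ** q) * v
                   for m, row in enumerate(img, start=1)
                   for n, v in enumerate(row, start=1))

    def eta(p, q):
        # Normalized central moment, gamma = (p + q + 2) / 2 and mu_00 = m_00.
        return mu(p, q) / m00 ** ((p + q + 2) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    phi3 = (e30 - 3 * e12) ** 2 + (3 * e21 - e03) ** 2
    phi4 = (e30 + e12) ** 2 + (e21 + e03) ** 2
    return [phi1, phi2, phi3, phi4]
```

On a discrete grid, translating the shape or rotating it by 90 degrees leaves the four values unchanged up to rounding, which gives a quick sanity check of an implementation.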



• SVD features of the target outline 

The SVD is widely applied in signal and image processing
because it is an efficient tool to solve problems with the least
squares method [21]. The SVD theorem states that if $A_{m \times n}$
with $m > n$ (representing in our context the original binary
data) is a real matrix 6 , then it can be written using a so-called
singular value decomposition of the form

$$A_{m \times n} = U_{m \times m} S_{m \times n} V^T_{n \times n}$$

where $U_{m \times m}$ and $V_{n \times n}$ are orthogonal 7 matrices. The
columns of $U$ are the left singular vectors. $V^T$ has rows that
are the right singular vectors. The real matrix $S$ has the same
dimensions as $A$ and has the form 8

$$S_{m \times n} = \begin{bmatrix} S_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix}$$

where $S_{r \times r} = \mathrm{Diag}\{\sigma_1, \sigma_2, \ldots, \sigma_r\}$ with $\sigma_1 \geq \sigma_2 \geq \ldots \geq
\sigma_r > 0$ and $1 \leq r \leq \min(m, n)$.

Calculating the SVD consists of finding the eigenvalues and
eigenvectors of $AA^T$ and $A^TA$. The eigenvectors of $A^TA$
make up the columns of $V$, and the eigenvectors of $AA^T$ make
up the columns of $U$. The singular values $\sigma_1, \ldots, \sigma_r$ are the
diagonal entries of $S_{r \times r}$ arranged in descending order, and
they are the square roots of the eigenvalues of $AA^T$ or $A^TA$.

5 It is theoretically possible to work with all seven Hu’s moments in our
MF-ATR method, but we have not tested this yet in our simulations.

6 For a complex matrix $A$, the singular value decomposition is $A =
USV^H$, where $V^H$ is the conjugate transpose of $V$.

7 They verify $U^T_{m \times m} U_{m \times m} = I_{m \times m}$ and $V^T_{n \times n} V_{n \times n} = I_{n \times n}$, where
$I_{m \times m}$ and $I_{n \times n}$ are respectively the identity matrices of dimensions $m \times m$
and $n \times n$.

8 $0_{p \times q}$ is a $p \times q$ matrix whose elements are all zero.

A method to calculate the set of discrete points
$\{a_1, a_2, \ldots, a_n\}$ of a target outline from a binary image
is proposed in [17]. The SVD features are then computed
from the eigenvalues of the circulant matrix built from the
discretized shape of the outline characterized by the vector
$\mathbf{d} = [d_1, d_2, \ldots, d_n]$, where $d_i$ is the distance from the centroid
of the outline to the discrete point $a_i$, $i = 1, 2, \ldots, n$, of the
outline.

In our analysis, it has been verified from our image
dataset that only the first components of the SVD features vector
$\boldsymbol{\sigma} = [\sigma_1, \sigma_2, \ldots, \sigma_r]$ take important values with respect to
the other ones. The other components of $\boldsymbol{\sigma}$ tend quickly
towards zero. Therefore only the first few components of $\boldsymbol{\sigma}$ play
an important role in characterizing the main features of the target
outline. However, if one considers only these few main first
components of $\boldsymbol{\sigma}$, one fails to characterize efficiently some
specific features (details) of the target profile. By doing so,
one would limit the performance of ATR. That is why we
propose to use the partial SVDs of the outline as explained in the
next paragraph.

To capture more details of the aircraft outline with SVD, one
has to take into account additional small singular values
as well. This is done with the following procedure borrowed from
the face recognition research community [24]. The normalized
distance vector $\tilde{\mathbf{d}} = [\tilde{d}_1, \tilde{d}_2, \ldots, \tilde{d}_n]$ is built from $\mathbf{d}$ by
taking $\tilde{\mathbf{d}} = [1, d_2/d_1, \ldots, d_n/d_1]$, where $d_1$ is the distance
between the centroid of the outline and the first chosen point
of the contour of the outline obtained by a classical 9 edge
detector algorithm. To capture the details of the target outline and
to reduce the computational burden, one works with partial
SVDs of the original outline by considering only $l$ sliding
sub-vectors $\tilde{\mathbf{d}}_w$ of $\tilde{\mathbf{d}}$, where $w$ is the number of components
of $\tilde{\mathbf{d}}_w$. For example, if one takes $w = 3$ points only in the
sub-vectors and if $\tilde{\mathbf{d}} = [\tilde{d}_1, \ldots, \tilde{d}_9]$, then one will take
the sub-vectors $\tilde{\mathbf{d}}_w^1 = [\tilde{d}_1, \tilde{d}_2, \tilde{d}_3]$, $\tilde{\mathbf{d}}_w^2 = [\tilde{d}_4, \tilde{d}_5, \tilde{d}_6]$ and
$\tilde{\mathbf{d}}_w^3 = [\tilde{d}_7, \tilde{d}_8, \tilde{d}_9]$ if we don’t use overlapping components
between sub-vectors. From each sub-vector, one constructs
the corresponding circulant matrix and applies the SVD to it to
get the partial SVD features vectors $\boldsymbol{\sigma}_w^{l=1}$, $\boldsymbol{\sigma}_w^{l=2}$, etc. The number
$l$ of partial SVDs of the original outline of the target is given
by $l = (n - w)/(w - m) + 1$, where $m$ is the number of
components overlapped by each two adjacent sub-vectors, and
$n$ is the total number of discrete contour points of the outline
given by the edge detector.
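The windowing procedure can be sketched as follows in pure Python. This is an illustrative sketch under stated assumptions, not the authors' code: the centroid is taken as the mean of the contour points, all function names are hypothetical, and instead of calling a general SVD routine the sketch uses the known property that the singular values of a circulant matrix are the moduli of the discrete Fourier transform of its generating vector.

```python
import cmath
import math


def centroid_distances(points):
    """Distances d_i from the outline centroid to each contour point a_i.

    Assumption: the centroid is the mean of the contour points.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [math.hypot(x - cx, y - cy) for x, y in points]


def normalize(d):
    """Normalized vector [1, d_2/d_1, ..., d_n/d_1]."""
    if d[0] == 0:
        raise ValueError("first distance d_1 must be non-zero")
    return [di / d[0] for di in d]


def sliding_windows(d, w, m=0):
    """Sub-vectors of width w; adjacent sub-vectors share m components."""
    step = w - m
    if step <= 0:
        raise ValueError("overlap m must be smaller than width w")
    return [d[s:s + w] for s in range(0, len(d) - w + 1, step)]


def circulant_singular_values(v):
    """Singular values (descending) of the circulant matrix generated by v.

    A circulant matrix is normal, so its singular values are the moduli
    of its eigenvalues, i.e. of the DFT of v.
    """
    n = len(v)
    eig = [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
           for k in range(n)]
    return sorted((abs(e) for e in eig), reverse=True)


def partial_svd_features(points, w, m=0):
    """One vector of w singular values per sliding window of the outline."""
    d = normalize(centroid_distances(points))
    return [circulant_singular_values(sub) for sub in sliding_windows(d, w, m)]
```

With $n = 9$, $w = 3$ and $m = 0$, `sliding_windows` returns three sub-vectors, in agreement with $l = (n - w)/(w - m) + 1 = 3$.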

9 In this work, we use the cvcontour function of the OpenCV software [22] to
extract the target outline from a binary image.

B. Step 2: BBA’s construction with PNN’s

In order to efficiently exploit fusion rules dealing with
conflicting information modeled by basic belief assignments
(BBA’s) [18], [23], we need to build BBA’s from all the features
computed from the images of the sequence under analysis. The
construction of the BBA’s needs expert knowledge or knowledge
drawn from training with an image dataset. In this paper,
we propose to use probabilistic neural networks (PNN),
initially developed in the nineties by Specht [19], to construct the
BBA’s because it is a common technique in the target
recognition and pattern classification community that is able to
achieve, with a large training dataset, performances close to those
obtained by a human expert in the field. The details of the PNN
settings for BBA construction are given in [12]. However,
because the neural network after training has, to some extent, a
good discriminant ability (close to that of an expert in the field), the
BBA is here constructed directly from
the PNN’s output, which is different from the construction of
the BBA based on the confusion matrix described in [12].

Here we present how the two PNN’s (shown in Figure 1)
work. In our application, we have $N_c = 7$ types of aircraft
in our training image dataset. For each type, the aircraft is
observed with $N_p = 3$ poses. Therefore we have $N_{cp} = N_c \times
N_p = 21$ distinct cases in our dataset. For each case,
one has $N_i = 30$ images available for the training. Therefore
the whole training dataset contains $N_{cpi} = N_c \times N_p \times N_i = 7 \times
3 \times 30 = 630$ binary images. For the first PNN (fed by Hu’s
features vector), the number of input layer neurons is 4 because
we use only the $\Phi = [\Phi_1, \Phi_2, \Phi_3, \Phi_4]$ Hu’s moment invariants in
this work. For the second PNN (fed by the partial SVD features
vector), the number of input layer neurons is constant and
equal to $l \times w$ because we take $l$ windows of width
$w$ (so one has $w$ singular values of partial SVD for every
window). The number of hidden layer neurons of each PNN is
the number of training samples, $N_{cpi} = 630$. The number
of output layer neurons is equal to $N_{cp} = 21$ (the number of
different possible cases).

Our PNN’s fed by features input vectors (Hu’s moments
and SVD outline) do not provide a hard decision on the type
and pose of the observed target under analysis because in our
belief-based approach we need to build BBA’s. Therefore the
competition function of the output layer for decision-making
implemented classically in the PNN scheme is not used in
the exploitation 10 phase of our approach. Instead, the PNN
computes the $N_{cp} \times N_i$ (Euclidean) distances between the
features vector of the image under test and the $N_{cpi} = 630$
features vectors of the training dataset. A Gaussian radial
basis function (G-RBF) is used in the hidden layer of the
PNN’s [19] to transform its input (Euclidean) distance vector
of size $1 \times N_{cpi}$ into another $1 \times N_{cpi}$ (similarity) vector
that feeds the output layer through a weighting matrix of size
$N_{cpi} \times N_{cp} = 630 \times 21$ estimated from the training samples. As
a final output of each PNN, we get an unnormalized similarity
vector $\mathbf{m}$ of size $(1 \times N_{cpi}) \times (N_{cpi} \times N_{cp}) = 1 \times N_{cp} = 1 \times 21$,
which is then normalized to get a Bayesian BBA on the frame
of discernment $\Theta = \{(target_i, pose_j), i = 1, \ldots, c, j =
1, \ldots, p\}$. Because we use only two 11 PNN’s in this approach,
we are able to build two Bayesian BBA’s $m_1(.)$ and $m_2(.)$
defined on the same frame $\Theta$ for every image of the sequence
to analyze.
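The construction of a Bayesian BBA from a PNN output can be sketched as below. This is a simplified illustration and not the network used in the paper: the kernel form $\exp(-d^2/(2\,spread^2))$, the use of a one-hot class-membership matrix as the weighting matrix (the standard PNN summation layer), the uniform fallback on numerical underflow, and the name `pnn_bayesian_bba` are all assumptions. A class index stands for one (target, pose) case.

```python
import math


def pnn_bayesian_bba(x, train_x, train_labels, n_classes, spread=0.1):
    """Return a normalized mass vector over n_classes (target, pose) cases.

    x            : features vector of the image under test
    train_x      : list of training features vectors (one hidden neuron each)
    train_labels : class index in [0, n_classes) of each training vector
    spread       : shaping parameter of the Gaussian RBF (assumed kernel form)
    """
    # Hidden layer: Gaussian similarity to every training sample.
    sims = []
    for t in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, t))
        sims.append(math.exp(-d2 / (2.0 * spread ** 2)))

    # Output layer: sum the similarities of the samples of each class
    # (equivalent to multiplying by a one-hot weighting matrix).
    scores = [0.0] * n_classes
    for s, c in zip(sims, train_labels):
        scores[c] += s

    # No competition (arg max) here: normalize to obtain a Bayesian BBA.
    total = sum(scores)
    if total == 0.0:
        # Assumption: fall back to a uniform BBA if every kernel underflows.
        return [1.0 / n_classes] * n_classes
    return [s / total for s in scores]
```

With the dimensions of the text, `train_x` would hold 630 vectors and `n_classes` would be 21; a case index could be encoded as `target * 3 + pose`, which is again only a convention chosen for this sketch.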

10 When analyzing a new sequence of an unknown observed aircraft.

11 A first PNN fed by Hu’s features, and a second PNN fed by SVD outline
features - see Fig. 1.

C. Step 3: Fusion of BBA’s and local decision

A basic belief assignment (BBA), also called a (belief) mass
function, is a mapping $m(.) : 2^\Theta \to [0, 1]$ such that $m(\emptyset) = 0$
and $\sum_{X \in 2^\Theta} m(X) = 1$, where $\Theta$ is the so-called frame of
discernment of the problem under concern, which consists of
a finite discrete set of exhaustive and exclusive hypotheses 12
$\theta_i$, $i = 1, \ldots, n$, and where $2^\Theta$ is the power-set of $\Theta$ (the set of
all subsets of $\Theta$). This definition of BBA has been introduced
in Dempster-Shafer Theory (DST) [23]. The focal elements
of a BBA are all elements $X$ of $2^\Theta$ such that $m(X) > 0$.
Bayesian BBA’s are special BBA’s having only singletons (i.e.
the elements of $\Theta$) as focal elements.
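The definition above can be checked mechanically. The sketch below represents a BBA as a dictionary from `frozenset` (a subset of the frame) to mass; this representation and the function names are assumptions made for illustration only.

```python
def is_bba(m, frame, tol=1e-9):
    """True if m maps subsets of frame to masses in [0, 1] that sum to 1
    and gives no mass to the empty set."""
    frame = frozenset(frame)
    for subset, mass in m.items():
        if not subset <= frame:
            return False          # not an element of the power-set
        if mass < -tol or mass > 1.0 + tol:
            return False          # mass outside [0, 1]
        if not subset and mass > tol:
            return False          # m(empty set) must be 0
    return abs(sum(m.values()) - 1.0) <= tol


def focal_elements(m):
    """Subsets X with m(X) > 0."""
    return {subset for subset, mass in m.items() if mass > 0}


def is_bayesian(m):
    """True if every focal element is a singleton."""
    return all(len(subset) == 1 for subset in focal_elements(m))
```

For example, on the frame `{"a", "b", "c"}` the assignment `{frozenset({"a"}): 0.7, frozenset({"b"}): 0.3}` is a Bayesian BBA, whereas giving mass to `frozenset({"a", "b"})` yields a BBA that is not Bayesian.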

In DST, the combination of BBA’s is done by Dempster’s
rule of combination [23], which corresponds to the normalized
conjunctive consensus operator. Because this fusion rule is
known to be not so efficient (in both highly and low
conflicting cases) in some practical situations [25], many alternative
rules have been proposed during the last decades [18], Vol. 2.

To overcome the practical limitations of Shafer’s model
and in order to deal with fuzzy hypotheses of the frame,
Dezert and Smarandache have proposed the possibility to
work with BBA’s defined on Dedekind’s lattice 13 $D^\Theta$ [18]
(Vol. 1) so that intersections (conjunctions) of elements of the
frame can be allowed in the fusion process, possibly with
some given restrictions (integrity constraints). Dezert and
Smarandache have also proposed several rules of combination
based on different Proportional Conflict Redistribution (PCR)
principles. Among these new rules, the PCR5 and PCR6 rules
play a major role because they do not degrade the specificity of
the fusion result (contrary to most other alternative rules),
and they preserve the neutrality of the vacuous BBA 14 . PCR5
and PCR6 provide the same combined BBA when combining
only two BBA’s $m_1(.)$ and $m_2(.)$, but they differ when
combining three (or more) BBA’s altogether. It has
recently been proved in [26] that PCR6 is consistent with the empirical
(frequentist) estimation of a probability measure, unlike other
fusion rules 15 . These two major differences with DST form
the basis of Dezert-Smarandache Theory (DSmT) [18].

12 This is what is called Shafer’s model of the frame in the literature.

13 Dedekind’s lattice is the set of all composite subsets built from elements
of $\Theta$ with $\cup$ and $\cap$ operators.

14 A vacuous BBA is the BBA such that $m(\Theta) = 1$.

15 Except the averaging rule.

16 For two BBA’s, a partial conflicting mass is a product $m_1(X) m_2(Y) >
0$ of the element $X \cap Y$ which is conflicting, that is such that $X \cap Y = \emptyset$.

In the context of this work, we propose to use PCR5 to
combine the two (Bayesian) BBA’s $m_1(.)$ and $m_2(.)$ built from
the two PNN’s fed by Hu’s features vector and the SVD outline
features vector. Because for each image of the observed target
in the sequence one has only two BBA’s to combine, the PCR5
fusion result is the same as the PCR6 fusion result. Of course,
if one wants to include other kinds of features vectors with
additional PNN’s, the PCR6 fusion rule is recommended. The
PCR principle consists in redistributing the partial conflicting
masses 16 only to the sets involved in the conflict and proportionally
to their mass. The PCR5 (or PCR6) combination of
two BBA’s is done according to the following formula 17 [18]

$$m_{PCR5/6}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{Y \in 2^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (3)$$

where all denominators in (3) are different from zero, and
$m_{PCR5/6}(\emptyset) = 0$. If a denominator is zero, that fraction
is discarded. All propositions/sets are in a canonical form.
Because we work here only with Bayesian BBA’s, the previous
fusion formula is in fact rather easy to implement, see [18]
(Vol. 2, Chap. 4).

In summary, the target features extraction in a sequence of
$K$ images allows us to generate, after Step 3, a set of BBA’s
$\{m^{Image_k}(.), k = 1, 2, \ldots, K\}$. Every BBA $m^{Image_k}(.)$ is
obtained by the PCR5/6 fusion of the BBA’s $m_1^{Image_k}(.)$ and
$m_2^{Image_k}(.)$ built from the outputs of the two PNN’s. From this
combined BBA, a local 18 decision $O_k$ can be drawn about
the target type and target pose in $Image_k$ by taking the focal
element of $m^{Image_k}(.)$ having the maximum mass of belief.
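For two Bayesian BBA’s on the same frame, every pair of distinct singletons is in conflict, so the PCR5/6 formula reduces to a double loop over the singletons. The sketch below illustrates Step 3 under that assumption; the BBA’s are plain lists indexed by (target, pose) case, and the function names are hypothetical.

```python
def pcr5_bayesian(m1, m2):
    """PCR5 (= PCR6 for two sources) fusion of two Bayesian BBA's.

    m1, m2 : lists of masses over the same ordered set of singletons.
    """
    n = len(m1)
    fused = []
    for x in range(n):
        # Conjunctive part: both sources agree on singleton x.
        mass = m1[x] * m2[x]
        # Proportional redistribution of each partial conflict involving x.
        for y in range(n):
            if y == x:
                continue
            d1 = m1[x] + m2[y]
            if d1 > 0:  # a fraction with zero denominator is discarded
                mass += m1[x] ** 2 * m2[y] / d1
            d2 = m2[x] + m1[y]
            if d2 > 0:
                mass += m2[x] ** 2 * m1[y] / d2
        fused.append(mass)
    return fused


def local_decision(m):
    """Index of the singleton with maximum mass (local decision O_k)."""
    return max(range(len(m)), key=m.__getitem__)
```

As a small worked example, fusing `[0.6, 0.4]` with `[0.2, 0.8]` gives approximately `[0.352, 0.648]`: the conjunctive masses are 0.12 and 0.32, and the two partial conflicts 0.48 and 0.08 are split proportionally between the two singletons, so the local decision is the second case.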

D. Step 4: Hidden Markov Model (HMM) for recognition 

Usually (and especially in a military context), the posture of
an aircraft can continuously change a lot along its flight path,
making target recognition based only on a single observation
(image) very difficult, because some ambiguities can occur
between the extracted features and those stored in the training
image dataset. To improve the target recognition performance
and robustness, we propose to use the sequence of target
recognition decisions $O_k$ drawn from the BBA’s $\{m^{Image_k}(.), k =
1, 2, \ldots, K\}$ to feed HMM classifiers in parallel. We suggest
this approach because the use of HMMs has already
proved to be very efficient in speech recognition, natural
language and face recognition. We briefly present HMMs, and
then we explain how they are used for automatic
aircraft recognition.

17 Here we assume that Shafer’s model holds. The notation $m_{PCR5/6}$
means that PCR5 and PCR6 are equivalent when combining two BBA’s.

18 Because it is based only on a single image of the unknown observed target
in the sequence under analysis.

Let us consider a dynamical system with a finite set of possible
states $S = \{s_1, s_2, \ldots, s_N\}$. The state transitions of the
system are modeled by a first-order Markov chain governed by
the transition probabilities given by $P(s(t_k) = s_j \mid s(t_{k-1}) =
s_i, s(t_{k-2}) = \ldots) = P(s(t_k) = s_j \mid s(t_{k-1}) = s_i) = a_{ij}$,
where $s(t_k)$ is the random state of the system at time $t_k$. An
HMM is a doubly stochastic process including an underlying
stochastic process (i.e. a Markov chain for modeling the state
transitions of the system), and a second stochastic process
for modeling the observation of the system (which is a
function of the random states of the system). An HMM, denoted
$\lambda = (A, B, \Pi)$, is fully characterized by the knowledge of the
following parameters:



1) The number $N$ of possible states $S = \{s_1, s_2, \ldots, s_N\}$
of the Markov chain.

2) The state transition probability matrix 19 $A = [a_{ij}]$ of
size $N \times N$, where $a_{ij} = P(s(t_k) = s_j \mid s(t_{k-1}) = s_i)$.

3) The prior mass function (pmf) $\Pi$ of the initial state of
the chain, that is $\Pi = \{\pi_1, \ldots, \pi_N\}$ with $\sum_{i=1}^{N} \pi_i = 1$,
where $\pi_i = P(s(t_1) = s_i)$.

4) The number $M$ of possible values $V = \{v_1, \ldots, v_M\}$
taken by the observation of the system.

5) The conditional pmfs of observed values given the states
of the system characterized by the matrix $B = [b_{mi}]$ of
size $M \times N$, with $b_{mi} = P(O_k = v_m \mid s(t_k) = s_i)$,
where $O_k$ is the observation of the system (i.e. the local
decision on target type with its pose) at time $t_k$.

In this work we consider a set of $N_c$ HMMs in parallel,
where each HMM is associated with a given type of target
to recognize. We consider the following state and observation
models in our HMMs:

- State model: For a given type of aircraft, we consider a
finite set of distinct aircraft postures available in our training
image dataset. In our application, we consider only three states
corresponding to $s_1$ = top view, $s_2$ = side view and $s_3$ =
front view as shown (for a particular aircraft) in Figure 2.






Fig. 2: Example of HMM states. 

- Observation model: In our HMMs, we assume that each
state (posture) of the aircraft is observable. Since we have
only $N_p = 3$ states $S = \{s_1, s_2, s_3\}$ for each aircraft,
and we have $N_c = 7$ types of aircraft in the training
dataset, we have to deal with $N_{cp} = 3 \times 7 = 21$ possible 20
observations (local decisions) at each time $t_k$. As explained
previously, at the end of Step 3 we have a set of BBA’s
$\{m^{Image_k}(.), k = 1, 2, \ldots, K\}$ that helps to draw the sequence
of local decisions $O^K = \{O_1, \ldots, O_k, \ldots, O_K\}$. This
sequence of decisions (also called recognition observations)
is used to evaluate the likelihood $P(O^K|\lambda_i)$ of the different
HMMs described by the parameters $\lambda_i = (A_i, B_i, \Pi_i)$,
$i = 1, 2, \ldots, N_c$. The computation of these likelihoods will
be detailed at the end of this section. The final decision
for ATR consists in inferring the true target type based on
the maximum likelihood criterion. More precisely, one will
decide that the target type is $i^*$ if $i^* = \arg\max_i P(O^K|\lambda_i)$.

19 We assume that the transition matrix is known and time-invariant, i.e. all
elements $a_{ij}$ do not depend on $t_{k-1}$ and $t_k$.

20 We assume that the unknown observed target type belongs to the set of
types of the dataset, as well as its pose.

• Estimation of HMM parameters

To perform recognition with HMMs, we first need to define
an HMM for each type of target one wants to recognize.
More precisely, we need to estimate the parameters $\lambda_i =
(A_i, B_i, \Pi_i)$, where $i = 1, \ldots, N_c$ is the target type in the
training dataset. The estimation of the HMM parameters is done
from observation sequences drawn from the training dataset
with the Baum-Welch algorithm [20], which must be initialized with
a chosen value $\lambda_i^0 = (A_i^0, B_i^0, \Pi_i^0)$. This initial value is chosen
as follows:

1) - State prior probabilities $\Pi_i^0$ for a target of type $i$: For each
HMM, we consider only three distinct postures (states) $s_1$, $s_2$
and $s_3$ for the aircraft. We use a uniform prior probability
mass distribution for all types of targets. Therefore, we take
$\Pi_i^0 = [1/3, 1/3, 1/3]$ for any target type $i = 1, \ldots, N_c$ to
recognize.

2) - State transition matrix $A_i^0$ of a target of type $i$: The
components $a_{pq}$ of the state transition matrix $A_i^0$ are estimated
from the analysis of many sequences 21 of target $i$ as follows

$$a_{pq} = \frac{\sum_{k=1}^{K-1} \delta(s(t_k), s_p) \times \delta(s(t_{k+1}), s_q)}{\sum_{k=1}^{K-1} \delta(s(t_k), s_p)} \qquad (4)$$

where $p, q \in \{1, \ldots, N_p\}$, $N_p$ is the number of states of the Markov chain,
$\delta(x, y)$ is the Kronecker delta function defined by $\delta(x, y) = 1$
if $y = x$, and $\delta(x, y) = 0$ otherwise, and where $K$ is
the number of images in the sequence of target $i$ available
in the training phase. For example, if in the training
phase and for a target of type $i = 1$, we have the
following sequence of (target type, pose) cases given by
[(1, 1), (1, 1), (1, 2), (1, 1), (1, 3), (1, 1), (1, 1)], then from Eq.
(4) with $K = 7$, we get 22

$$A_{i=1}^0 = \begin{bmatrix} 2/4 & 1/4 & 1/4 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$


3) - Observation matrix $B_i^0$ for a target of type $i$: The
initial observation matrix $B_i^0$ is given by the confusion matrix
learnt from all the images of the training dataset. More precisely,
from every image of the training dataset, we extract Hu’s
features and partial SVD outline features and we feed each
PNN to get two BBA’s according to Steps 1-3. From the
combined BBA, we make the local decision $(target_i, pose_j)$ if
$m((target_i, pose_j))$ is bigger than all the other masses of belief
of the BBA. This procedure is applied to all images in the
training dataset. By doing so, we can estimate empirically
the probabilities of deciding $(target_i, pose_j)$ when the real case
$(target_{i'}, pose_{j'})$ occurs. So we have an estimate of all the
components of the global confusion matrix $B^0 = [P(decision =
(target_i, pose_j) \mid reality = (target_{i'}, pose_{j'}))]$. From $B^0$
we extract the $N_c$ sub-matrices (conditional confusion matrices)
$B_i^0$, $i = 1, \ldots, N_c$, by taking all the rows of $B^0$ corresponding
to the target of type $i$. In our application, one has $N_c = 7$
types and $N_p = 3$ postures (states) for each target type, hence
one has $N_{cp} = 7 \times 3 = 21$ possible observations. Therefore
the global confusion matrix $B^0$ has size $21 \times 21$ and is the stack
of $N_c = 7$ sub-matrices $B_i^0$, $i = 1, \ldots, N_c$, each of size
$N_p \times N_{cp} = 3 \times 21$.
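The initialization of the transition and observation matrices can be sketched as follows. This is a minimal illustration with zero-based state and case indices; the function names, the uniform row used for a state or case that never occurs, and the `target * n_poses + pose` row ordering are assumptions of the sketch.

```python
def estimate_transition_matrix(states, n_states):
    """Count-based estimate of a_pq, Eq. (4), from one pose sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: a state never left in the data gets a uniform row.
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix


def confusion_matrix(truth, decided, n_cases):
    """Rows: real case, columns: decided case, entries P(decision | reality)."""
    counts = [[0] * n_cases for _ in range(n_cases)]
    for t, d in zip(truth, decided):
        counts[t][d] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([1.0 / n_cases] * n_cases)  # same assumption as above
        else:
            matrix.append([c / total for c in row])
    return matrix


def sub_matrix(global_b, target, n_poses):
    """Rows of the global confusion matrix that belong to one target type,
    assuming case index = target * n_poses + pose."""
    return global_b[target * n_poses:(target + 1) * n_poses]
```

Applied to the pose sequence of the example, written with zero-based poses as `[0, 0, 1, 0, 2, 0, 0]`, `estimate_transition_matrix` returns `[[0.5, 0.25, 0.25], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]`, which matches the matrix given in the text.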



21 The video streams of different (known) aircraft flights generate the
sequences of images used to estimate $a_{pq}$ approximately.

22 One verifies that the probabilities of each row of this matrix sum to 1.



• Exploitation of HMM for ATR 

Given a sequence $O^K$ of $K$ local decisions drawn from the
sequence of $K$ images, and given $N_c$ HMMs characterized by
their parameters $\lambda_i$ ($i = 1, \ldots, N_c$), one has to compute all the
likelihoods $P(O^K|\lambda_i)$, and then infer from them the true target
type based on the maximum likelihood criterion, which is done
by deciding the target type $i^*$ if $i^* = \arg\max_i P(O^K|\lambda_i)$. The
computation of $P(O^K|\lambda_i)$ is done as follows [20]:

• generation of all possible state sequences of length
$K$, $S_l = [s_l(t_1) s_l(t_2) \ldots s_l(t_K)]$, where $s_l(t_k) \in S$
($k = 1, \ldots, K$) and $l = 1, 2, \ldots, |S|^K$

• computation of $P(O^K|\lambda_i)$ by applying the total probability
theorem as follows 23

$$P(S_l|\lambda_i) = \pi_{s_l(t_1)} \cdot a_{s_l(t_1) s_l(t_2)} \cdot \ldots \cdot a_{s_l(t_{K-1}) s_l(t_K)} \qquad (5)$$

$$P(O^K|\lambda_i, S_l) = b_{s_l(t_1) O_1} \cdot b_{s_l(t_2) O_2} \cdot \ldots \cdot b_{s_l(t_K) O_K} \qquad (6)$$

$$P(O^K|\lambda_i) = \sum_{l=1}^{|S|^K} P(O^K|\lambda_i, S_l) P(S_l|\lambda_i) \qquad (7)$$
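Eqs. (5)-(7) can be written directly as an enumeration over all $|S|^K$ state sequences, which is only practical for short sequences. The sketch below implements that enumeration and, as a swapped-in standard technique that is not described in the text, the forward recursion, which returns the same likelihood in $O(K|S|^2)$ operations. The function names are hypothetical, `B` is indexed as `B[state][observation]` following Eq. (6), and all indices are zero-based.

```python
from itertools import product


def likelihood_bruteforce(obs, A, B, pi):
    """P(O | lambda) by summing over all state sequences, Eqs. (5)-(7)."""
    n = len(pi)
    total = 0.0
    for seq in product(range(n), repeat=len(obs)):
        # Eq. (5) and Eq. (6) multiplied together for this state sequence.
        p = pi[seq[0]] * B[seq[0]][obs[0]]
        for k in range(1, len(obs)):
            p *= A[seq[k - 1]][seq[k]] * B[seq[k]][obs[k]]
        total += p  # Eq. (7)
    return total


def likelihood_forward(obs, A, B, pi):
    """Same likelihood computed with the forward recursion."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * A[r][s] for r in range(n)) * B[s][o]
                 for s in range(n)]
    return sum(alpha)


def classify(obs, hmms):
    """Index i* of the HMM (A, B, pi) with maximum likelihood."""
    scores = [likelihood_forward(obs, A, B, pi) for A, B, pi in hmms]
    return max(range(len(hmms)), key=scores.__getitem__)
```

For sequences of several hundred observations, the raw products above can underflow in floating point; a practical version would rescale `alpha` at each step or work with log-likelihoods, which is left out here to keep the sketch short.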

III. Simulation results

For the simulations of the SMF-ATR method, we have used
$N_c = 7$ types of aircraft in the training image dataset. Each
image of the sequence has $1200 \times 702$ pixels. The sequences
of aircraft observations in the training dataset contain 150 frames.
The $N_p = 3$ poses of every aircraft are shown in Fig. 3.
For evaluating our approach, we have used sequences (test
samples) of images of 7 different aircraft, more precisely
the Lockheed-F22, Junkers-G.38ce, Tupolev ANT 20 Maxime
Gorky, Caspian Sea Monster (Kaspian Monster), Mirage-F1,
Piaggio P180, and Lockheed-Vega, flying under conditions that
generate a lot of state (posture) changes in the images. The
number of images in each test sequence varies from
400 to 500. The shaping parameter of the G-RBF of the PNN’s
has been set to 0.1. The simulation is done in two phases: 1)
the training phase (for training the PNN’s and estimating the HMM
parameters), and 2) the exploitation phase for testing the real
performance of the SMF-ATR with test sequences.

A - Performance evaluation

In our simulations, we have tested SMF-ATR with two
different fusion rules: 1) the PCR5 rule (see Section II-C),
and 2) the Dempster-Shafer (DS) rule [23] (see footnote 24). The percentages of
successful recognition (i.e. the recognition rate $R_i$) obtained
with these two SMF-ATR methods are shown in Table I for
each type $i = 1, 2, \ldots, N_c$ of aircraft. The performances of
these SMF-ATR versions are globally very good since one
is able to recognize the types of aircraft included in the image
sequences under test with a minimum of 85.2% of success
when using DS-based SMF-ATR, and with a minimum of

23 The index $i$ of the components of the $A_i$ and $B_i$ matrices has been omitted for
notation convenience in the last two formulas.

24 Because Dempster’s rule is one of the bases of Dempster-Shafer Theory,
we prefer to call it the Dempster-Shafer rule, or just the DS rule. This rule
coincides here with the Bayesian fusion rule because we combine two Bayesian
BBA’s and we don’t use informative priors.
Fig. 3: Poses of different types of aircraft.



93.5% of success with the PCR5-based SMF-ATR. In terms of
computational time, it takes between 5 ms and 6 ms to process
each image in the sequence with no particular optimization
of our simulation code, which indicates that this SMF-ATR
approach is close to meeting the requirement for real-time aircraft
recognition. It can be observed that PCR5-based SMF-ATR
outperforms DS-based SMF-ATR for 3 types of aircraft (types 3, 4
and 7 in Table I) and gives the same recognition rate as DS-based
SMF-ATR for the other types. So PCR5-based SMF-ATR is globally
better than DS-based SMF-ATR for our application.
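To make the difference between the two fusion rules compared in this subsection concrete, here is a minimal sketch of both rules for the special case of two Bayesian BBA's, i.e. masses committed to singletons only (the case mentioned in footnote 24). The function names and the numerical masses are hypothetical; the formulas are the usual two-source expressions of Dempster's rule and of PCR5 restricted to this case, not code taken from the paper.

```python
import numpy as np

def dempster_bayesian(m1, m2):
    """Dempster's rule for two Bayesian BBAs (masses on singletons only)."""
    joint = np.asarray(m1, dtype=float) * np.asarray(m2, dtype=float)
    agreement = joint.sum()            # equals 1 minus the total conflicting mass
    if agreement == 0.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return joint / agreement           # conflict is removed by normalization

def pcr5_bayesian(m1, m2):
    """PCR5 for two Bayesian BBAs: each partial conflict m1(x) * m2(y), x != y,
    is redistributed to x and y in proportion to the two masses involved."""
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    fused = m1 * m2                    # conjunctive consensus on each singleton
    n = len(m1)
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            if m1[x] + m2[y] > 0.0:    # share of the conflict m1(x) * m2(y) sent to x
                fused[x] += m1[x] ** 2 * m2[y] / (m1[x] + m2[y])
            if m2[x] + m1[y] > 0.0:    # share of the conflict m2(x) * m1(y) sent to x
                fused[x] += m2[x] ** 2 * m1[y] / (m2[x] + m1[y])
    return fused

# Hypothetical BBAs from the two feature vectors over three target types.
m_hu = [0.6, 0.3, 0.1]
m_svd = [0.2, 0.7, 0.1]
m_ds = dempster_bayesian(m_hu, m_svd)
m_pcr5 = pcr5_bayesian(m_hu, m_svd)
```

Both results sum to one. Dempster's rule discards the conflicting mass through normalization, whereas PCR5 sends each partial conflict back only to the two hypotheses that generated it.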



Target type          1      2      3      4      5      6      7
$R_i$ (PCR5 rule)  95.7   93.5   96.3   98.2   96.3   98.5   97.3
$R_i$ (DS rule)    95.7   93.5   85.2   97.8   96.3   98.5   97.2

TABLE I: Aircraft recognition rates $R_i$ (in %).



B - Robustness of SMF-ATR to image scaling 

To evaluate the robustness of the (PCR5-based) SMF-ATR approach
to image scaling effects, we applied scaling changes
(zoom out) of ZO = 1/2, ZO = 1/4 and ZO = 1/8 to the
images of the sequences under test. The performances of
SMF-ATR are shown in Table II. One sees that the degradation
of recognition performance of SMF-ATR due to scaling effects
is very limited since even with a 1/8 zoom out the recognition
rates stay between 89.3% and 95.0%, i.e. around 90% or more
of successful target recognition. The performance will decline
sharply if the zoom out of the targets goes beyond 1/16.
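As an illustration of what such a zoom out does to a binary image, the sketch below shrinks a synthetic frame by integer factors. The text does not state how the scaling was implemented, so plain pixel decimation is an assumption here, as are the function name `zoom_out`, the row-by-column layout of the frame (702 rows, 1200 columns) and the rectangular "target".

```python
import numpy as np

def zoom_out(binary_image, factor):
    """Shrink a binary image by an integer factor by keeping every factor-th pixel."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return np.asarray(binary_image)[::factor, ::factor]

# Synthetic 702 x 1200 (rows x columns) binary frame with a rectangular "target".
frame = np.zeros((702, 1200), dtype=np.uint8)
frame[300:400, 500:800] = 1

for factor in (2, 4, 8):               # ZO = 1/2, 1/4, 1/8
    small = zoom_out(frame, factor)
    print(f"ZO = 1/{factor}: shape {small.shape}, target pixels {int(small.sum())}")
```

With a zoom out of 1/8 the target is described by roughly 64 times fewer pixels, which gives an idea of why the outline-based features eventually degrade at stronger zoom out values.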

C - Robustness to compound type 

Table III gives the performances of SMF-ATR on sequences 
with two types of targets (475 images with type 1, and 382 
images with type 2). 

The two left columns of Table III show the performances 



Target type        1      2      3      4      5      6      7
$R_i$ (no ZO)    95.7   93.5   96.3   98.2   96.3   98.5   97.3
$R_i$ (ZO=1/2)   95.0   92.0   95.2   94.7   96.1   96.6   95.4
$R_i$ (ZO=1/4)   95.0   92.0   94.7   91.7   93.6   91.6   95.7
$R_i$ (ZO=1/8)   95.0   92.2   93.1   89.3   93.6   94.5   90.7

TABLE II: Aircraft recognition rates $R_i$ (in %) of (PCR5/6-based)
SMF-ATR with different zoom out values.



Aircraft           Single Type 1   Single Type 2   Compound Type
$R_i$ (SMF-ATR)    96.3%           98.5%           97.3%

TABLE III: Robustness to target compound.



obtained when recognizing each type separately in each sub-sequence.
The last column shows the performance when
recognizing the compound type Type 1 ∪ Type 2. One sees
that the performance obtained with the compound type (97.3%) is
close to the weighted average recognition rate of 97.5% (see footnote 25). This
indicates that no long run of recognition errors occurs when
the target type changes during the recognition process, making
SMF-ATR robust to target type switch.

D - Performances with and without HMMs 

We have also compared the performances of SMF-ATR
with two methods using more features but which do not exploit
sequences of images with HMM. More precisely, the recognition
is done locally from the combined BBA for every image
without temporal integration processing based on HMM. We
call these two Multiple Features Fusion methods MFF1 and
MFF2 respectively. MFF1 uses Hu’s moments, NMI
(Normalized Moment of Inertia), affine invariant moments, and
SVD of outline as features, with PNN and PCR5 fusion, whereas MFF2 uses
the same features as MFF1 but with a BP network as classifier
and the DS rule of combination. The recognition performances are
shown in Table IV. One sees clearly the advantage of using
image sequence processing with HMMs because of the significant
improvement of ATR performances. The recognition rate of
MFF2 declines seriously because the convergence of the BP
network is not good enough.



Target type        1      2      3      4      5      6      7
$R_i$ (SMF-ATR)  95.7   93.5   96.3   98.2   96.3   98.5   97.3
$R_i$ (MFF1)     89.2   92.0   91.2   86.9   92.2   93.5   95.0
$R_i$ (MFF2)     64.9   51.6   82.8   82.2   70.8   48.3   58.9

TABLE IV: Performances (in %) with and without HMMs.



E - SMF-ATR versus SSF-ATR 

We have also compared in Table V the performances
of SMF-ATR with those of two simple SSF-ATR (see footnote 26) methods,
called SSF1-ATR and SSF2-ATR. SSF1-ATR uses only
Hu’s moments as features whereas SSF2-ATR uses only the SVD
of outline as features. SSF1-ATR exploits image sequence
information using BP networks as classifier and the DS rule for
combination, while SSF2-ATR uses PNN and the PCR5/6 rule.

25 According to the proportion of the two types in the whole sequence.

26 SSF-ATR stands for Single-feature Sequence Automatic Target Recognition.
Target type         1      2      3      4      5      6      7
$R_i$ (SMF-ATR)   95.7   93.5   96.3   98.2   96.3   98.5   97.3
$R_i$ (SSF1-ATR)  39.3   42.3   74.3   56.7   60.1   33.9   44.3
$R_i$ (SSF2-ATR)  88.8   66.4   86.7   66.9   73.6   52.9   63.8

TABLE V: Performances (in %) of SMF-ATR and SSF-ATR.



One clearly sees the advantage of SMF-ATR with
respect to SSF-ATR due to the combination of information
drawn from both kinds of features (Hu’s moments and SVD of outline)
extracted from the images.
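The comparisons of Tables IV and V can be summarized with a few lines of arithmetic. The sketch below copies the recognition rates from the two tables and computes, for each method, the worst per-type rate and the unweighted mean over the 7 target types. The unweighted mean is not a figure reported in the paper; it is used here only as a rough summary, assuming that all types count equally.

```python
from statistics import mean

# Recognition rates (in %) copied from Tables IV and V, target types 1 to 7.
rates = {
    "SMF-ATR":  [95.7, 93.5, 96.3, 98.2, 96.3, 98.5, 97.3],
    "MFF1":     [89.2, 92.0, 91.2, 86.9, 92.2, 93.5, 95.0],
    "MFF2":     [64.9, 51.6, 82.8, 82.2, 70.8, 48.3, 58.9],
    "SSF1-ATR": [39.3, 42.3, 74.3, 56.7, 60.1, 33.9, 44.3],
    "SSF2-ATR": [88.8, 66.4, 86.7, 66.9, 73.6, 52.9, 63.8],
}

# Worst per-type rate and unweighted mean for each method.
summary = {name: (min(r), mean(r)) for name, r in rates.items()}
for name, (worst, average) in summary.items():
    print(f"{name:9s} worst type: {worst:5.1f}   unweighted mean: {average:5.1f}")

# Number of target types on which SMF-ATR is at least as good as each baseline.
wins = {
    name: sum(a >= b for a, b in zip(rates["SMF-ATR"], r))
    for name, r in rates.items() if name != "SMF-ATR"
}
```

On these figures SMF-ATR is at least as good as every baseline on each of the 7 target types, which is consistent with the conclusions drawn from the two tables.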

IV. Conclusions and perspectives 

A new SMF-ATR approach based on features extraction has
been proposed. The features extracted from binary images feed
PNNs for building basic belief assignments that are combined
with the PCR5 rule of DSmT to make a local decision (based on one image
only) on the target type. The set of local decisions acquired
over time for the image sequence feeds HMMs to make
the final recognition of the target. The evaluation of this new
SMF-ATR approach has been done with realistic sequences
of aircraft observations. SMF-ATR is able to achieve higher
recognition rates than classical approaches that do not exploit
HMMs, or than SSF-ATR. A complementary analysis of the
robustness of SMF-ATR to target occultation is currently in
progress and will be published in a forthcoming paper. Our
very preliminary results, based only on a few sequences, indicate
that SMF-ATR seems very robust to target occultations
occurring randomly in single (non-consecutive) images, but a
finer analysis based on Monte-Carlo simulation will be done
to evaluate quantitatively its robustness in different conditions
(number of consecutive occultations in the sequences, level
of occultation, etc.). As interesting perspectives, we want to
extend the SMF-ATR approach to detect new target types that
are not included in the image data set. We would also like to
deal with the recognition of multiple crossing targets observed
in the same image sequence.